Proximity Search in Databases

Authors

  • Roy Goldman
  • Narayanan Shivakumar
  • Suresh Venkatasubramanian
  • Hector Garcia-Molina
Abstract

An information retrieval (IR) engine can rank documents based on textual proximity of keywords within each document. In this paper we apply this notion to search across an entire database for objects that are "near" other relevant objects. Proximity search enables simple "focusing" queries based on general relationships among objects, helpful for interactive query sessions. We view the database as a graph, with data in vertices (objects) and relationships indicated by edges. Proximity is defined based on shortest paths between objects. We have implemented a prototype search engine that uses this model to enable keyword searches over databases, and we have found it very effective for quickly finding relevant information. Computing the distance between objects in a graph stored on disk can be very expensive. Hence, we show how to build compact indexes that allow us to quickly find the distance between objects at search time. Experiments show that our algorithms are efficient and scale well.
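The graph model in the abstract can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph held in memory as an adjacency dictionary, and it ranks candidate objects purely by hop count to the closest relevant object. The function names, the toy movie data, and the ranking rule are illustrative assumptions, not the paper's scoring function or its disk-based index structures.

```python
from collections import deque

def distances_from(graph, sources):
    """Multi-source BFS: hop count from the nearest source to each reachable vertex."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:          # first visit is the shortest path in an unweighted graph
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def proximity_search(graph, find_set, near_set):
    """Rank objects in find_set by shortest-path distance to any object in near_set.

    Simplified, assumed ranking: closest first, ties broken by name.
    Objects with no path to near_set are dropped.
    """
    dist = distances_from(graph, near_set)
    reachable = [obj for obj in find_set if obj in dist]
    return sorted(reachable, key=lambda obj: (dist[obj], obj))

# Hypothetical toy database: vertices are objects, edges are relationships.
graph = {
    "movie1": ["actor_a", "actor_b"],
    "movie2": ["actor_b", "actor_c"],
    "actor_a": ["movie1"],
    "actor_b": ["movie1", "movie2"],
    "actor_c": ["movie2"],
}

ranked = proximity_search(graph, ["movie2", "movie1"], ["actor_a"])
```

Here `ranked` lists `movie1` before `movie2`, because `movie1` is one edge away from `actor_a` while `movie2` is three edges away. Running a fresh traversal per query is what becomes expensive when the graph lives on disk, which is the cost the precomputed distance indexes described in the abstract are meant to avoid.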


Similar resources

A System for Keyword Proximity Search on XML Databases

Keyword proximity search is a user-friendly information discovery technique that has been extensively studied for text documents. In extending this technique to structured databases, recent works [6, 7, 4, 2] provide keyword proximity search on labeled graphs. A keyword proximity search does not require the user to know the structure of the graph, the role of the objects containing the keywords...


Using Transformation Techniques Towards Efficient Filtration of String Proximity Search of Biological Sequences

The problem of proximity search in biological databases is addressed. We study vector transformations and conduct the application of DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques for DNA sequence proximity search to reduce the search time of range queries. Our empirical results on a number of Prokaryote and Eukaryote DNA ...


Filtration of String Proximity Search via Transformation

The problem of proximity search in biological databases is addressed. We study vector transformations and conduct the application of DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques for DNA sequence proximity search to reduce the search time of range queries. Our empirical results on a number of Prokaryote and Eu...


ICRA: Effective Semantics for Ranked XML Keyword Search

Keyword search is a user-friendly way to query XML databases. Most previous efforts in this area focus on keyword proximity search in XML based on either a tree data model or a graph (or digraph) data model. The tree data model for XML is generally simple and efficient for keyword proximity search. However, it cannot capture connections such as ID references in XML databases. In contrast, technique...


Efficient Group of Permutants for Proximity Searching

Modeling proximity searching problems in a metric space allows one to approach many problems in different areas, e.g. pattern recognition, multimedia search, or clustering. Recently, the permutation-based approach was proposed, a novel technique that is unbeatable in practice but difficult to compress. In this article we introduce an improvement on that metric space search data structure. ...



Publication date: 1998